Detect-and-Track: Efficient Pose Estimation in Videos
Authors
Abstract
This paper addresses the problem of estimating and tracking human body keypoints in complex, multi-person video. We propose an extremely lightweight yet highly effective approach that builds upon the latest advancements in human detection [15] and video understanding [5]. Our method operates in two stages: keypoint estimation in frames or short clips, followed by lightweight tracking to generate keypoint predictions linked over the entire video. For frame-level pose estimation, we experiment with Mask R-CNN as well as our own proposed 3D extension of this model, which leverages temporal information over small clips to generate more robust frame predictions. We conduct extensive ablative experiments on the newly released multi-person video pose estimation benchmark, PoseTrack, to validate various design choices of our model. Our approach achieves an accuracy of 55.2% on the validation set and 51.8% on the test set using the Multi-Object Tracking Accuracy (MOTA) metric, and achieves state-of-the-art performance on the ICCV 2017 PoseTrack keypoint tracking challenge [1].
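The second stage described above, linking per-frame detections into tracks, can be illustrated with a minimal sketch. The abstract does not say how the linking is done, so everything below is an assumption for illustration: each frame is represented by a list of person bounding boxes (as a detector such as Mask R-CNN might produce), and consecutive frames are linked by greedy matching on box overlap (IoU). The names `iou`, `link_tracks`, and the threshold `min_iou` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tracks(frames: List[List[Box]], min_iou: float = 0.3) -> List[List[int]]:
    """Assign a track id to every box, frame by frame.

    Assumption: greedy best-overlap matching against the previous frame only.
    Boxes that match nothing start a new track.
    """
    next_id = 0
    prev: List[Tuple[int, Box]] = []  # (track id, box) pairs from the last frame
    all_ids: List[List[int]] = []
    for boxes in frames:
        ids = [-1] * len(boxes)
        # Score every (previous box, current box) pair, best overlap first.
        pairs = sorted(
            ((iou(pb, b), pi, ci)
             for pi, (_, pb) in enumerate(prev)
             for ci, b in enumerate(boxes)),
            reverse=True,
        )
        used_prev = set()
        for score, pi, ci in pairs:
            if score < min_iou:
                break  # remaining pairs overlap too little to link
            if pi in used_prev or ids[ci] != -1:
                continue  # one of the two boxes is already matched
            ids[ci] = prev[pi][0]
            used_prev.add(pi)
        # Unmatched detections open new tracks.
        for ci in range(len(boxes)):
            if ids[ci] == -1:
                ids[ci] = next_id
                next_id += 1
        prev = list(zip(ids, boxes))
        all_ids.append(ids)
    return all_ids


if __name__ == "__main__":
    video = [
        [(0, 0, 10, 10), (20, 20, 30, 30)],
        [(21, 20, 31, 30), (1, 0, 11, 10)],  # same two people, listed in swapped order
        [(2, 0, 12, 10)],                    # second person has left the frame
    ]
    print(link_tracks(video))
```

Greedy matching is used here only because it needs no external dependencies; an optimal bipartite assignment could be substituted for the inner loop without changing the interface. In a pose-tracking setting, the keypoints predicted for each box would simply inherit that box's track id.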
Similar resources
Deep Convolutional Poses for Human Interaction Recognition in Monocular Videos
Human interaction recognition is a challenging problem in computer vision and has been researched over the years due to its important applications. With the development of deep models for the human pose estimation problem, this work aims to verify the effectiveness of using the human pose in order to recognize the human interaction in monocular videos. This paper developed a method based on 5 s...
Camera Pose Estimation in Unknown Environments using a Sequence of Wide-Baseline Monocular Images
In this paper, a feature-based technique for camera pose estimation in a sequence of wide-baseline images is proposed. Camera pose estimation is an important issue in many computer vision and robotics applications, such as augmented reality and visual SLAM. The proposed method can track captured images taken by a hand-held camera in room-sized workspaces with maximum scene depth of 3-4...
Pose Estimation and Tracking of Eating Persons in Real-life Settings
We present an approach to estimate and track 2D upper body poses of persons who are having a meal in videos with highly challenging uncontrolled imaging conditions. We employ a probabilistic model that represents the body as a kinematic tree, perform inference in this kinematic tree model using particle filtering, and also estimate self-occlusions. Our approach is evaluated with 7 different v...
Real-time RGB-D-based Object and Manipulator Pose Estimation
We present an overview of our recent work on real-time model-based object pose estimation from intensity and depth cues. We have developed a system that can simultaneously track the pose of hundreds of rigid objects. By incorporating proprioceptive information, objects can be tracked together with their robotic manipulator, enabling accurate visual servo-control even in the presence of severe c...
Computation of Slip analysis to detect adhesion for protection of rail vehicle and derailment
Adhesion level for the proper running of rail wheelset on track has remained a significant problem for researchers in detecting slippage to avoid accidents. In this paper, the slippage of rail wheels has been observed applying forward and lateral motions to slip velocity and torsion motion. The longitudinal and lateral forces behavior is watched with respect to traction force to note correlatio...
Journal title:
- CoRR
Volume abs/1712.09184 (no issue number listed)
Pages -
Publication date 2017